Compiling for SIMD Within a Register

Authors

  • Randall J. Fisher
  • Henry G. Dietz
Abstract

Although SIMD (Single Instruction stream Multiple Data stream) parallel computers have existed for decades, it is only in the past few years that a new version of SIMD has evolved: SIMD Within A Register (SWAR). Unlike other styles of SIMD hardware, SWAR models are tuned to be integrated within conventional microprocessors, using their existing memory reference and instruction handling mechanisms, with the primary goal of improving the speed of specific multimedia operations. Because the SWAR implementations for various microprocessors vary widely and each is missing instructions for some SWAR operations that are needed to support a more general, portable, high-level SIMD execution model, this paper focuses on how these missing operations can be implemented using either the existing SWAR hardware or even conventional 32-bit integer instructions. In addition, SWAR offers a few new challenges for compiler optimization; these issues are briefly introduced, as are the SWARC module language and the Scc compiler.
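
To illustrate what implementing a SWAR operation with conventional 32-bit integer instructions can look like, here is a minimal sketch of a partitioned add over four 8-bit fields packed into one 32-bit word. It is a well-known masking technique and is not taken from the paper; the function name `swar_add8` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not from the paper: adds four 8-bit fields packed
 * in x and y, with each field wrapping modulo 256 and no carry crossing
 * into the neighboring field. */
uint32_t swar_add8(uint32_t x, uint32_t y)
{
    const uint32_t low7 = 0x7f7f7f7fu;  /* low 7 bits of every field   */
    const uint32_t msb  = 0x80808080u;  /* top bit of every field      */

    /* Add the low 7 bits of each field; any carry lands in that field's
     * own top bit and cannot reach the next field. */
    uint32_t sum = (x & low7) + (y & low7);

    /* Fold in the top bits with XOR, which is addition without a
     * carry out of the field. */
    return sum ^ ((x ^ y) & msb);
}
```

For example, `swar_add8(0x01FF7F80, 0x01010180)` yields `0x02008000`: the fields `0xFF + 0x01` and `0x80 + 0x80` each wrap to `0x00` without disturbing their neighbors.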

Similar resources

From SIMD to Micro-Grids

I. COMPILING TO SIMD PARALLELISM Most commodity microprocessors now support multi-media instructions. These instruction-set extensions are typically based on the Single Instruction-stream Multiple Data-stream (SIMD) model in which a single instruction causes the same mathematical operation to be carried out on many operands, or pairs of operands at the same time. The multi-media instructions on...

General-Purpose SIMD Within A Register: Parallel Processing On Consumer Microprocessors

Recent extensions to microprocessor instruction sets are intended to speed up multimedia algorithms by allowing SIMD parallel processing over multiple data fields within each processor register. These extensions, while effectively supporting hand-coding of some multimedia tasks, do not directly support a high-level parallel programming model. Unfortunately, the extensions vary widely across differe...

Best Paper Awards: Techniques for Reducing Read Latency of Core Bus Wrappers; Code Selection for Media Processors with SIMD Instructions; Cost Reduction and Evaluation of a Temporary Faults Detecting Technique

The paper selected as the most outstanding in the field of CAD (track A) is "Techniques for Reducing Read Latency of Core Bus Wrappers" by Roman L. Lysecki, Frank Vahid, and Tony D. Givargis of the University of California, Riverside, USA. The authors address the problem of assembling cores in systems-on-a-chip without introducing extra latencies. They present a technique for automatically desig...

Single Instruction Multiple Data – Not Everything is a Nail for this Hammer

Hardware vendors have been struggling to fight the power and memory wall for decades [1, 2]. Since most of the processing time depends on the number of instructions, number of used registers and dependencies between instructions, but not on the size of a register, independent data items of a vector (i.e., a column) could be processed in parallel. Hence, a silver lining seems to be Single Instru...

Compiling Rewriting onto SIMD and MIMD/SIMD Machines

We present compilation techniques for Simple Maude, a declarative programming language based on Rewriting Logic which supports term, graph, and object-oriented rewriting. We show how to compile various constructs of Simple Maude onto SIMD and MIMD/SIMD massively parallel architectures, and in particular onto the Rewrite Rule Machine, a special purpose MIMD/SIMD architecture for rewriting. We sh...

Publication date: 1998